Abstract: Stochastic digital backpropagation (SDBP) is an
extension of digital backpropagation (DBP) and is based on
the maximum a posteriori principle. SDBP takes into account
noise from the optical amplifiers in addition to handling
deterministic linear and nonlinear impairments. The decisions in
SDBP are taken on a symbol-by-symbol (SBS) basis, ignoring 
any residual memory, which may be present due to non-optimal 
processing in SDBP. In this paper, we extend SDBP to account for 
memory between symbols. In particular, two different methods 
are proposed: a Viterbi algorithm (VA) and a decision-directed
(DD) approach. The symbol error rate (SER) of memory-based SDBP
is significantly lower than that of the previously proposed SBS-SDBP.
For inline dispersion-managed links, the VA-SDBP has up to 10 
and 14 times lower SER than DBP for QPSK and 16-QAM, 
respectively. 

Index Terms: Digital backpropagation, factor graphs, near-MAP
detector, nonlinear compensation, optical communications.


I. Introduction 

Digital backpropagation (DBP) has been proposed as
a universal technique for jointly compensating for the
intra-channel linear and nonlinear impairments in coherent
fiber-optic systems [1]-[5]. As a result, DBP has been
used to benchmark schemes proposed in the literature [6]-[9].
The assumed optimality of DBP has spurred intense
research in low-complexity variations, including weighted
DBP, perturbation DBP, and filtered DBP [10], [11]. While
the focus of the current paper is on single-channel systems,
for wavelength division multiplexing (WDM) systems,
DBP is typically employed for the center channel, thereby 
accounting only for the intra-channel effects. Inter-channel 
nonlinear effects in WDM systems can be modeled by exploiting
the temporal correlations of the nonlinear
phase noise using a time-varying system with inter-symbol
interference (ISI), thereby compensating for these inter-channel
nonlinear effects [12]-[14]. While DBP has received
a great deal of attention, it only deals with deterministic linear
and nonlinear impairments and inherently does not consider 
noise. It is known that the nondeterministic nonlinear effects, 
such as nonlinear signal-noise interaction (NSNI) between 
the transmitted signal and the amplified spontaneous emission 
(ASE) noise, limit the transmission performance of a fiber-optic
system [10], [15], [16]. Studies on the impact of NSNI
reveal that there is a significant penalty due to NSNI for inline 
optical dispersion-managed (DM) links, and the severity of the 
NSNI depends on the modulation format and the symbol rate
used in the system [17], [18]. It is often argued that NSNI
cannot be compensated for in digital signal processing (DSP) 
due to the nondeterministic nature of ASE noise ESI and as 
a result, none of the DBP methods account for NSNI. To deal 
with stochastic disturbances, Bayesian detection theory can be 
used to formulate maximum a posteriori probability (MAP) 
detectors, which are provably optimal in terms of minimizing 
the error probability. MAP detectors have been proposed 
for the discrete memoryless channel EH assuming perfect 
chromatic dispersion (CD) compensation in a DM link, and a 
look-up table detector that can mitigate data-pattern-dependent 
nonlinear impairments [20]. A low-complexity Viterbi detector
is suggested as an alternative or to complement DBP for
combating fiber nonlinearities [21]. In [22], the stochastic
digital backpropagation (SDBP) algorithm was proposed to 
compensate not only for deterministic linear and nonlinear 
effects, but also to account for the ASE noise. However, the 
decisions were taken on a symbol-by-symbol (SBS) basis after 
applying a matched filter. This approach was later shown to 
be suboptimal [23].

In this paper, we extend [22] to address the sub-optimality
in SBS-SDBP by explicitly accounting for residual memory, 
which may be present due to matched filtering in SDBP. In 
particular, we propose two different methods based on the 
Viterbi algorithm (VA) and a decision-directed (DD) approach. 
The VA approach is similar to EQ, but does not rely on 
sending long training sequences to learn the VA branch 
metrics. The DD approach uses previously decoded symbols 
when taking the decisions for the current symbol and as a 
result is computationally less complex than the VA approach. 
Extensive simulation results indicate significant performance 
improvements over DBP and SBS-SDBP, in particular for 
DM links. While the proposed algorithm is computationally 
complex, we believe this receiver can serve as an inspiration 
to design low-complexity approaches that still significantly 
outperform DBP. 
The remainder of this paper is organized as follows. We first
describe the underlying mathematical framework on which
SDBP is built, namely factor graphs (FGs) and message-passing
algorithms, in Sec. II. In Sec. III, the system model is
detailed. Sec. IV is devoted to the description of the VA and
DD approaches, as well as how these are incorporated into
the SDBP framework. We present numerical results in Sec. V,
followed by our conclusions in Sec. VI.

Notation: Lower case bold letters (e.g., $\mathbf{x}$) are used to
denote vectors, including sequences of symbols and vector
representations of continuous-time signals (e.g., through oversampling).
The transpose of the vector $\mathbf{v}$ is denoted by $\mathbf{v}^T$.
A multivariate Gaussian probability density function (PDF) of
a real variable $\mathbf{r}$ with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$ is
denoted by $\mathcal{N}(\mathbf{r}; \boldsymbol{\mu}, \boldsymbol{\Sigma})$.

II. Factor Graphs for Receiver Design 

Optimal receiver design for digital communications in terms 
of minimizing the error probability is based on a MAP 
criterion. However, MAP detectors can be computationally 
intractable (often with exponential complexity in the dimensionality
of the unknown variable), except for certain communication
systems. For this reason, much research effort has
been devoted to developing near-MAP detectors, which can
balance near-optimal performance with reasonable computational
complexity. A practical framework that has emerged
since the early 2000s as a general and automated way to
develop near-MAP detectors is that of FGs [24], [25]. An FG
is a graph that describes the statistical relation between the 
variable of interest (i.e., the unknown transmitted data) and 
the observation (i.e., the received waveform). By performing 
a message-passing algorithm on such an FG, it is possible to 
determine the MAP estimate, or an approximation thereof. 

FGs have been widely used in wireless communication as 
they provide a methodology to design receivers in a systematic 
and near-automated way [26], [27]. Some applications include
message-passing decoders for low-density parity-check
(LDPC) and turbo codes [28], iterative demodulation and decoding
for bit-interleaved coded modulation [29], joint equalization
and decoding [30], channel estimation [31], timing
synchronization [32], and phase-noise recovery [33]. It should
be noted that while FGs generally lead to the most powerful
known receiver algorithms with polynomial complexity, they
are often too complex to be implemented as is. For that reason,
FGs are often used as a first approach from which practical
algorithms can be developed with lower complexity [30].

In the context of coherent fiber-optic communication, FGs 
have only seen limited utilization, mainly due to their high 
computational complexity. Examples include demodulation
[34], decoding [35], equalization [36], and computation of
information rates [37]. Nevertheless, FGs can serve as a good
basis to develop low-complexity receivers. A key application
is the design of a near-MAP detector in the nonlinear regime,
as proposed in [22]. The resulting receiver, coined SDBP,
showed significant performance gains over DBP. In SDBP, the 
posterior distribution is obtained by marginalizing the joint
distribution of the input, all intermediate unobserved variables in
the channel, and the received signal over the unobserved variables.
The statistical relationship between all these variables
can be described with an FG (in this case a Markov chain),
and the marginalization is performed by message passing.
However, the FG-based SDBP receiver proposed in [22] is
not a true MAP receiver due to a number of heuristic design
choices: (i) a matched filter followed by
symbol-rate sampling was employed, similar to DBP, which
is suboptimal, may not give rise to sufficient statistics,
and may exhibit residual memory; (ii) decisions were made
on an SBS basis, ignoring any residual memory; and (iii) for
each SBS decision, a Gaussian approximation of messages
was introduced. The first issue was addressed in [38], while
a possible solution to the third issue was discussed in [23],
considering alternative distributions to a Gaussian in Cartesian
coordinates. An alternative approach would be to use a
non-parametric approach with a kernel, where the kernel bandwidth
is a free parameter that should be tuned [39]. The second issue
is addressed in this work.

Fig. 1. A fiber link with N spans where each span consists of an
SMF, a DCM (for DM links), and EDFAs.

III. System Model 

The system that will be considered is a single-channel fiber-optic
system as shown in Fig. 1, comprising a dual-polarization
transmitter block (Tx), including a pulse shaper, a fiber-optic 
link with N spans, and a receiver block (Rx) that implements 
a compensation algorithm followed by a decision unit. Each 
span of the fiber-optic link consists of a transmission fiber, 
which is a standard single-mode fiber (SMF) and an optional 
dispersion-compensating module (DCM) for DM links. In 
between fiber spans, there are erbium-doped fiber amplifiers 
(EDFAs) that compensate for the losses in the previous fiber. 
As indicated in Fig. 1, the transmitted data is denoted by $\mathbf{s}$,
the decoded data by $\hat{\mathbf{s}}$, and the received signal by $\mathbf{r}$. The noise
and gain of the EDFAs are denoted by $\mathbf{w}_i$ and $G_i$, respectively,
where $i \in \{1,2\}$ corresponds to EDFA1 and EDFA2.

A sequence of K four-dimensional symbols
$\mathbf{s} = [\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_K]^T \in \Omega^K$ is transmitted at a symbol rate $1/T_s$
with a pulse-shaping filter $g(t)$, where $\Omega \subset \mathbb{R}^4$ is the set of
symbols in the four-dimensional constellation, consisting of in-phase
and quadrature data from the x and y polarizations. The
overall goal of the receiver is to optimally recover $\mathbf{s}$ from $\mathbf{r}$.
While different optimality criteria can be considered, we aim
to minimize the error probability, leading to a MAP receiver,
in which the estimate of $\mathbf{s}$ is

$$\hat{\mathbf{s}} = \arg\max_{\mathbf{s} \in \Omega^K} p(\mathbf{s}|\mathbf{r}), \tag{1}$$

where $p(\mathbf{s}|\mathbf{r})$ is the a posteriori probability distribution of $\mathbf{s}$
given the received signal $\mathbf{r}$. Note that in all derivations, we
will consider all signals and vectors to be real.

IV. SDBP and Proposed Approaches 

As indicated in Sec. II, message passing on an FG relies on
local dependencies. In a fiber, the lowest level of local deterministic
dependencies that one can exploit are (i) the linear and
nonlinear operations from the split-step Fourier method (SSFM)
and (ii) the statistical dependency between input and output of
the EDFAs, including ASE noise. Considering the signals after
each linear and nonlinear step of each segment of each span of
the SSFM [22, Fig. 1], and the signals after each EDFA, all as
unobserved variables, one can factorize the joint distribution
of the transmitted data $\mathbf{s}$, the received waveform $\mathbf{r}$, and these
unobserved variables. The posterior distribution in (1) can thus
be interpreted as a marginalization of the joint distribution
$p(\mathbf{s}, \text{unobserved variables}|\mathbf{r})$. This marginalization from the
joint distribution to $p(\mathbf{s}|\mathbf{r})$ is done using the framework of
FGs and a message-passing algorithm called the sum-product
algorithm (SPA) [28].

Definition 1 (Particle representation of a distribution). A list
of particles (or samples) $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(N_p)}$, denoted by
$\{\mathbf{x}^{(n)}\}_{n=1}^{N_p}$, form a particle representation of a distribution
$p(\mathbf{x})$ when $p(\mathbf{x}) \approx \frac{1}{N_p} \sum_{n=1}^{N_p} \delta(\mathbf{x} - \mathbf{x}^{(n)})$.

Definition 2 (Distributions associated with particles). With a
list of particles $\{\mathbf{x}^{(n)}\}_{n=1}^{N_p}$ defined over $\mathbb{R}^{4M}$, we associate two
distributions: $q_c(\mathbf{x})$ is a distribution (obtained, e.g., through
a parametric approximation) defined over $\mathbb{R}^{4M}$ for which
the particles form a sample representation, while $q_d(\mathbf{x})$ is a
distribution defined only over $\Omega^M$, with $q_d(\mathbf{x}) \propto q_c(\mathbf{x})$.
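As a minimal illustration of Definition 1, the following Python sketch draws particles from a known Gaussian and uses the sample average implied by the delta-mixture approximation to estimate expectations. The helper name `particle_expectation`, the chosen distribution, and the random seed are illustrative assumptions and are not part of SDBP itself.

```python
import numpy as np

def particle_expectation(particles, f):
    """Approximate E[f(x)] under p(x) ~ (1/Np) * sum_n delta(x - x^(n))."""
    return np.mean([f(x) for x in particles], axis=0)

# Hypothetical example: 5000 particles drawn from a 2-D Gaussian.
rng = np.random.default_rng(0)
true_mean = np.array([1.0, -2.0])
particles = rng.normal(loc=true_mean, scale=0.5, size=(5000, 2))

# First moment and mean squared norm, both estimated from the particle list.
est_mean = particle_expectation(particles, lambda x: x)
est_sq_norm = particle_expectation(particles, lambda x: x @ x)
```

The accuracy of such estimates improves with the number of particles, which is why the number of particles is treated as a tuning parameter in particle-based methods.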

A. SDBP 

The main idea of SDBP is to marginalize out the unobserved
variables through computing messages, which describe statistically
(i.e., in the form of a distribution) the uncertainty of the
corresponding variable. This allows us to obtain a description
of $p(\mathbf{s}|\mathbf{r})$. The messages are computed backwards, starting with
the received signal $\mathbf{r}$ at span N of the fiber-optic link of Fig. 1
until the transmitter is reached.

For the fiber-optic channel, closed-form expressions of the
distributions are not possible to derive except for some specific
scenarios, so the message/distribution is represented with a list
of $N_p$ particles. These particles are propagated at each stage
of the fiber-optic link in Fig. 1, starting from $\mathbf{r}$, as described
below.

We start with the known received waveform $\mathbf{r}$ in Fig. 1,
which exhibits no uncertainty and is thus represented by $N_p$
identical particles (line 2 in Algorithm 1). These particles are
passed through the inverse of the EDFA2 block of the last span
to get a collection of particles, which describe the uncertainty
regarding the variable before EDFA2 (line 4 in Algorithm 1).
The particles are then back propagated through the inverse
of the SSFM of the DCM (line 5 in Algorithm 1), where
$\mathrm{SSFM}_2^{-1}(\mathbf{r}^{(n)})$ implements the inverse SSFM for the entire fiber
span. The particles are then back propagated through EDFA1
(line 6 in Algorithm 1) and through the inverse of the SSFM
of the SMF (line 7 in Algorithm 1). This process is repeated
for all N spans. Note that when $N_p = 1$ and $\mathbf{w}_1^{(n)} = \mathbf{w}_2^{(n)} = \mathbf{0}$
for all n, these steps are identical to DBP.

Algorithm 1 Pseudo-code for implementation of SDBP

 1: procedure SDBP($\mathbf{r}$)
 2:   $\mathbf{r}^{(n)} \leftarrow \mathbf{r}$, $\forall n$   ▷ create $N_p$ replicas of $\mathbf{r}$
 3:   for i = N to 1 do   ▷ Iteration over spans
 4:     $\mathbf{r}^{(n)} \leftarrow (\mathbf{r}^{(n)} + \mathbf{w}_2^{(n)})/\sqrt{G_2}$, $\forall n$   ▷ EDFA2
 5:     $\mathbf{r}^{(n)} \leftarrow \mathrm{SSFM}_2^{-1}(\mathbf{r}^{(n)})$, $\forall n$   ▷ DCM
 6:     $\mathbf{r}^{(n)} \leftarrow (\mathbf{r}^{(n)} + \mathbf{w}_1^{(n)})/\sqrt{G_1}$, $\forall n$   ▷ EDFA1
 7:     $\mathbf{r}^{(n)} \leftarrow \mathrm{SSFM}_1^{-1}(\mathbf{r}^{(n)})$, $\forall n$   ▷ SMF
 8:   end for
 9:   $\tilde{\mathbf{s}}^{(n)} \leftarrow \mathrm{MF}(\mathbf{r}^{(n)})$, $\forall n$   ▷ MF followed by sampling at symbol rate
10: end procedure
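The span loop of Algorithm 1 (lines 2 to 8) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the inverse fiber operators are passed in as callables, `toy_inverse` is a hypothetical stand-in (a pure nonlinear phase rotation) for a real inverse SSFM, complex baseband samples are used for convenience although the derivations in the text treat all signals as real, and all numeric values are illustrative.

```python
import numpy as np

def complex_noise(rng, shape, std):
    """Circular complex Gaussian noise with total standard deviation `std`."""
    return std * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def sdbp_backpropagate(r, n_spans, n_particles, inv_smf, inv_dcm,
                       gain1, gain2, noise_std1, noise_std2, rng):
    """Lines 2-8 of Algorithm 1: push a particle cloud back through all spans."""
    # Line 2: N_p identical replicas of the received waveform.
    particles = np.tile(np.asarray(r, dtype=complex), (n_particles, 1))
    for _ in range(n_spans):  # line 3: the last span is processed first
        # Line 4: inverse of EDFA2 (add a noise realization, undo the gain).
        particles = (particles + complex_noise(rng, particles.shape, noise_std2)) / np.sqrt(gain2)
        # Line 5: inverse propagation through the DCM.
        particles = np.array([inv_dcm(p) for p in particles])
        # Line 6: inverse of EDFA1.
        particles = (particles + complex_noise(rng, particles.shape, noise_std1)) / np.sqrt(gain1)
        # Line 7: inverse propagation through the SMF.
        particles = np.array([inv_smf(p) for p in particles])
    return particles

def toy_inverse(x, phase_per_power=0.01):
    """Hypothetical placeholder for an inverse SSFM: undo a power-dependent phase."""
    return x * np.exp(-1j * phase_per_power * np.abs(x) ** 2)

rng = np.random.default_rng(1)
r = rng.standard_normal(8) + 1j * rng.standard_normal(8)
# One particle and zero noise reduces to a deterministic, DBP-like chain.
dbp_like = sdbp_backpropagate(r, 3, 1, toy_inverse, toy_inverse, 2.0, 2.0, 0.0, 0.0, rng)
# Many particles with noise give a cloud describing the uncertainty.
cloud = sdbp_backpropagate(r, 3, 50, toy_inverse, toy_inverse, 2.0, 2.0, 0.05, 0.05, rng)
```

Passing the fiber operators as callables keeps the sketch independent of any particular SSFM implementation; the matched filter and sampling of line 9 would be applied to each row of the returned array.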

As a final step (line 9 in Algorithm 1), SDBP must compute
the message related to the transmitted data $\mathbf{s}$, based on the message
describing the waveform after pulse shaping. A heuristic
approach has been used in [22], where each particle waveform
is passed through a matched filter (MF), matched to the pulse
shape, and sampled at the symbol rate$^{1}$ at the optimal sampling
times, leading to $N_p$ particles $\{\tilde{\mathbf{s}}^{(n)}\}_{n=1}^{N_p}$, with $\tilde{\mathbf{s}}^{(n)} \in \mathbb{R}^{4K}$.
The particles $\{\tilde{\mathbf{s}}^{(n)}\}_{n=1}^{N_p}$ can be viewed as samples from a
distribution $q_c(\mathbf{s})$ (defined for $\mathbf{s} \in \mathbb{R}^{4K}$), for which $q_d(\mathbf{s})$
(defined for $\mathbf{s} \in \Omega^K$) provides an approximation of $p(\mathbf{s}|\mathbf{r})$.
It is important to note that $q_d(\mathbf{s})$ is only an approximation of
$p(\mathbf{s}|\mathbf{r})$ and need not be identical to $p(\mathbf{s}|\mathbf{r})$, as the use of an
MF followed by sampling at the symbol rate is a heuristic.
Hence, performing SBS decisions on the marginals of $q_c(\mathbf{s})$
as in [22] may not lead to optimal performance (in terms
of minimizing the probability of error, either symbol-wise or
sequence-wise). In fact, alternatives to an MF were explored in
[38], indicating performance improvements. In this paper, we
propose to exploit residual memory present due to non-optimal
processing, by making a decision regarding $\mathbf{s}$ based on the
entire distribution $q_c(\mathbf{s})$, rather than its marginals, leading to
the following detector

$$\hat{\mathbf{s}} = \arg\max_{\mathbf{s} \in \Omega^K} q_d(\mathbf{s}), \tag{2}$$

where again $q_d(\mathbf{s}) \propto q_c(\mathbf{s})$. Solving (2) is hard for two reasons:
(i) the number of possible sequences, $|\Omega|^K$, is exponential in K,
making the optimization infeasible for large values of K, and
(ii) for any specific sequence in $\Omega^K$, $q_d(\mathbf{s})$ is hard to determine
since we only have particles $\{\tilde{\mathbf{s}}^{(n)}\}_{n=1}^{N_p}$ representing $q_c(\mathbf{s})$. In
order to address the first issue, we impose a Markov structure
onto $q_d(\mathbf{s})$. To solve the second issue, the set of particles
is smoothed with a distribution, which will be discussed in
Sec. IV-B. We now present two approaches that use variations
of (2) to make decisions on $\mathbf{s}$.


1 A matched filter maximizes the signal-to-noise ratio for a signal affected
by AWGN [40, Ch. 10].
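B. VA-SDBP

Assuming that $q_d(\mathbf{s})$ follows a Markov structure with memory$^{2}$
$L > 0$, we define $\mathbf{x}_k = [\mathbf{s}_{k-1}, \mathbf{s}_{k-2}, \ldots, \mathbf{s}_{k-L}]^T$ and
$\mathbf{y}_k = [\mathbf{s}_k \; \mathbf{x}_k^T]^T$. Then $q_d(\mathbf{s})$ can be factorized as

$$q_d(\mathbf{s}) = \prod_{k=L+1}^{K} q_d(\mathbf{s}_k | \mathbf{s}_{k-1}, \mathbf{s}_{k-2}, \ldots, \mathbf{s}_1) = \prod_{k=L+1}^{K} q_d(\mathbf{s}_k | \mathbf{x}_k). \tag{3}$$

Using Bayes' rule, $q_d(\mathbf{s}_k | \mathbf{x}_k)$ can be written as

$$q_d(\mathbf{s}_k | \mathbf{x}_k) = \frac{q_d(\mathbf{s}_k, \mathbf{x}_k)}{q_d(\mathbf{x}_k)} = \frac{q_d(\mathbf{y}_k)}{q_d(\mathbf{x}_k)}. \tag{4}$$

Using (3) and (4), (2) can be rewritten as

$$\hat{\mathbf{s}} = \arg\max_{\mathbf{s} \in \Omega^K} \{\ln q_d(\mathbf{s})\} = \arg\max_{\mathbf{s} \in \Omega^K} \left\{ \sum_{k=1}^{K} [\ln q_d(\mathbf{y}_k) - \ln q_d(\mathbf{x}_k)] \right\} = \arg\min_{\mathbf{s} \in \Omega^K} \sum_{k=1}^{K} \psi_k(\mathbf{s}_k, \mathbf{x}_k), \tag{5}$$

where $\psi_k$ is the branch metric and $\mathbf{x}_k$ is the state used in the
VA.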

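The values of $q_d(\mathbf{y}_k)$ and $q_d(\mathbf{x}_k)$ can be computed by
marginalizing $q_d(\mathbf{s})$. However, we have access only to $q_c(\mathbf{s})$
through the particles $\{\tilde{\mathbf{s}}^{(n)}\}_{n=1}^{N_p}$. Denoting the appropriate
subsequences from $\tilde{\mathbf{s}}^{(n)}$ by $\mathbf{y}_k^{(n)}$ and $\mathbf{x}_k^{(n)}$, we find that
$q_c(\mathbf{y}_k) \approx \frac{1}{N_p} \sum_{n=1}^{N_p} \delta(\mathbf{y}_k - \mathbf{y}_k^{(n)})$ and
$q_c(\mathbf{x}_k) \approx \frac{1}{N_p} \sum_{n=1}^{N_p} \delta(\mathbf{x}_k - \mathbf{x}_k^{(n)})$.
In order to evaluate $q_d(\mathbf{y}_k)$ and $q_d(\mathbf{x}_k)$, we can impose a
parametric approximation for $q_c(\mathbf{y}_k)$ and $q_c(\mathbf{x}_k)$ for which the
logarithm is easy to compute. The Gaussian distribution is such
a parametric approximation$^{3}$. Hence
$q_c(\mathbf{y}_k) = \mathcal{N}(\mathbf{y}_k; \boldsymbol{\mu}_k^y, \boldsymbol{\Sigma}_k^y)$
and $q_c(\mathbf{x}_k) = \mathcal{N}(\mathbf{x}_k; \boldsymbol{\mu}_k^x, \boldsymbol{\Sigma}_k^x)$.
The means $\boldsymbol{\mu}_k^y$, $\boldsymbol{\mu}_k^x$ and covariances
$\boldsymbol{\Sigma}_k^y$, $\boldsymbol{\Sigma}_k^x$ are estimated as

$$\boldsymbol{\mu}_k^y = \frac{1}{N_p} \sum_{n=1}^{N_p} \mathbf{y}_k^{(n)}, \qquad \boldsymbol{\mu}_k^x = \frac{1}{N_p} \sum_{n=1}^{N_p} \mathbf{x}_k^{(n)},$$
$$\boldsymbol{\Sigma}_k^y = \frac{1}{N_p - 1} \sum_{n=1}^{N_p} (\mathbf{y}_k^{(n)} - \boldsymbol{\mu}_k^y)(\mathbf{y}_k^{(n)} - \boldsymbol{\mu}_k^y)^T,$$
$$\boldsymbol{\Sigma}_k^x = \frac{1}{N_p - 1} \sum_{n=1}^{N_p} (\mathbf{x}_k^{(n)} - \boldsymbol{\mu}_k^x)(\mathbf{x}_k^{(n)} - \boldsymbol{\mu}_k^x)^T. \tag{6}$$

The factors in (4) can be written as

$$q_d(\mathbf{y}_k) \propto \exp\left\{ -\tfrac{1}{2} (\mathbf{y}_k - \boldsymbol{\mu}_k^y)^T (\boldsymbol{\Sigma}_k^y)^{-1} (\mathbf{y}_k - \boldsymbol{\mu}_k^y) \right\}, \tag{7}$$

$$q_d(\mathbf{x}_k) \propto \exp\left\{ -\tfrac{1}{2} (\mathbf{x}_k - \boldsymbol{\mu}_k^x)^T (\boldsymbol{\Sigma}_k^x)^{-1} (\mathbf{x}_k - \boldsymbol{\mu}_k^x) \right\}. \tag{8}$$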

2 The memory L is a tuning parameter, where larger L will lead to higher
complexity and better performance.

3 A non-parametric approach with a kernel can be used as an alternative
[39], where the kernel bandwidth is a free parameter that should be tuned.


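As a result, $\psi_k$ can be simplified as

$$\psi_k(\mathbf{s}_k, \mathbf{x}_k) \propto (\mathbf{y}_k - \boldsymbol{\mu}_k^y)^T (\boldsymbol{\Sigma}_k^y)^{-1} (\mathbf{y}_k - \boldsymbol{\mu}_k^y) - (\mathbf{x}_k - \boldsymbol{\mu}_k^x)^T (\boldsymbol{\Sigma}_k^x)^{-1} (\mathbf{x}_k - \boldsymbol{\mu}_k^x). \tag{9}$$

To find an estimate $\hat{\mathbf{s}}$ using (5), a VA$^{4}$ is used with the current
state as $\mathbf{x}_k$ and the current symbol as $\mathbf{s}_k$ with branch metric
as in (9) for the kth symbol slot. Observe that since the search
space for $\mathbf{s}$ is $\Omega^K$, the search space for $\mathbf{y}_k$ is $\Omega^{L+1}$ and
$\mathbf{x}_k \in \Omega^L$. When L = 0, VA-SDBP reverts back to SBS-SDBP from
[22].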
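A generic trellis search of the kind used by VA-SDBP can be sketched in Python as below. This is a minimal sketch under stated assumptions rather than the authors' implementation: symbols are represented by integer indices into the constellation, the branch metric is supplied as a callable (in VA-SDBP it would be the quadratic-form difference in (9)), every initial state starts at zero cost to mimic a uniform prior over states, and the random metric table in the usage lines is purely illustrative.

```python
import itertools
import numpy as np

def viterbi(K, M, L, branch_metric):
    """Minimise sum_k branch_metric(k, s_k, x_k) over sequences of K symbol indices.

    The state x_k = (s_{k-1}, ..., s_{k-L}) is a tuple of L symbol indices in range(M).
    Returns the decided sequence and its accumulated cost.
    """
    states = list(itertools.product(range(M), repeat=L))
    cost = {x: 0.0 for x in states}   # uniform prior over the initial states
    history = []                      # one survivor table per symbol slot
    for k in range(K):
        new_cost, survivor = {}, {}
        for x in states:
            for s in range(M):
                nxt = (s,) + x[:-1]   # shift the new symbol into the state
                c = cost[x] + branch_metric(k, s, x)
                if nxt not in new_cost or c < new_cost[nxt]:
                    new_cost[nxt], survivor[nxt] = c, x
        cost = new_cost
        history.append(survivor)
    # Trace back from the cheapest final state.
    state = min(cost, key=cost.get)
    total = cost[state]
    decided = []
    for survivor in reversed(history):
        decided.append(state[0])
        state = survivor[state]
    return decided[::-1], total

# Illustrative usage with a random metric table (a stand-in for the metric in (9)).
K, M, L = 5, 2, 2
rng = np.random.default_rng(3)
table = rng.random((K, M) + (M,) * L)
decided, total = viterbi(K, M, L, lambda k, s, x: table[(k, s) + x])
```

Each trellis stage evaluates the metric for all M^(L+1) combinations of state and symbol, which reflects the exponential growth of complexity with the memory L.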


C. DD-SDBP 

The second approach, DD-SDBP, combines the idea of
exploiting memory, as in VA-SDBP, with taking decisions on
an SBS basis, as in SBS-SDBP. In this approach, the previously
decoded symbols, $[\hat{\mathbf{s}}_{k-1}, \hat{\mathbf{s}}_{k-2}, \ldots, \hat{\mathbf{s}}_{k-L}]^T$, are used while
taking decisions for the current symbol $\mathbf{s}_k$. As a result,
$\mathbf{x}_k$ is fixed to these past decisions and can be interpreted as a
constant that does not affect the optimization. Thus, decisions
on $\mathbf{s}_k$ in DD-SDBP are taken as

$$\hat{\mathbf{s}}_k = \arg\max_{\mathbf{s}_k \in \Omega} q_d(\mathbf{s}_k | \hat{\mathbf{s}}_{k-1}, \hat{\mathbf{s}}_{k-2}, \ldots, \hat{\mathbf{s}}_{k-L}) = \arg\min_{\mathbf{s}_k \in \Omega} \psi_k(\mathbf{s}_k, \mathbf{x}_k), \tag{10}$$

where minimizing the branch metric is equivalent to maximizing
$q_d$, consistent with (5). This problem can be solved recursively
starting from $k = L+1$ (see footnote 5) using
a similar Gaussian approximation as in VA-SDBP. However, in
contrast to VA-SDBP, DD-SDBP can only account for causal
memory effects. Note that the search space for $\mathbf{y}_k$ in (10) is
$\Omega$ instead of $\Omega^{L+1}$ as the decisions have to be taken only for
$\mathbf{s}_k$. When L = 0, DD-SDBP also reverts to SBS-SDBP.
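The decision-directed recursion can be sketched in Python as follows. As before, this is a sketch under assumptions: symbols are integer indices, the branch metric is a caller-supplied callable standing in for (9), the first L decisions are assumed to be available (for example from symbol-by-symbol detection), and the toy metric in the usage lines is hypothetical.

```python
import numpy as np

def dd_detect(K, M, L, branch_metric, initial_decisions):
    """Decide symbols one at a time, feeding back the previous L decisions as the state."""
    decided = list(initial_decisions)   # the first L decisions are given
    for k in range(L, K):
        # State built from past decisions: (s_{k-1}, ..., s_{k-L}).
        x = tuple(decided[k - 1 - j] for j in range(L))
        # Only M candidates are evaluated, instead of M^(L+1) per trellis stage.
        costs = [branch_metric(k, s, x) for s in range(M)]
        decided.append(int(np.argmin(costs)))
    return decided

# Toy metric: the cheapest symbol is always the previous decision plus one (mod M).
toy_metric = lambda k, s, x: 0.0 if s == (x[0] + 1) % 4 else 1.0
sequence = dd_detect(K=6, M=4, L=1, branch_metric=toy_metric, initial_decisions=[0])
# sequence == [0, 1, 2, 3, 0, 1]
```

Because each decision is final once taken, an early error propagates through the state, which is the price paid for the lower complexity compared with the trellis search.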

V. Numerical Simulations and Discussion 
A. Simulation Setup and Performance Metrics 

The simulation setup is shown in Fig. 1. The pulse shape
used at the transmitter is a root-raised-cosine pulse with a roll-off
factor of 0.25 and truncation length of 16 symbol periods.
The simulations are performed for a polarization-multiplexed
signal, with no polarization mode dispersion$^{6}$, either with 16-QAM
or QPSK as modulation format and symbol rates $R_s$ of
14 Gbaud, 28 Gbaud, and 56 Gbaud. In each polarization, K =
4096 symbols$^{7}$ are transmitted in each block of the Monte
Carlo simulation. This signal is input to the channel with N
spans. The parameters used for the SMF are $D = 16$ ps/(nm km),
$\gamma = 1.3$ (W km)$^{-1}$, and $\alpha = 0.2$ dB/km, in accordance
with ITU-T G.652. We have considered a fiber Bragg grating
(FBG) as a DCM$^{8}$. Propagation in the fibers is simulated using
the SSFM with a segment length [41] of $\Delta = (\epsilon L_N L_D^2)^{1/3}$,
where $\epsilon = 10^{-4}$, $L_N = 1/(\gamma P)$ is the nonlinear length,
$L_D = T_s^2 2\pi c/(|D|\lambda^2)$ is the dispersion length, $\lambda$ is the wavelength,
c is the speed of light, and P is the average input power
to each fiber span. We used the same segment lengths for
simulating the channel and for both DBP and SDBP.

4 We assume a uniform a priori distribution over all the states, which implies
that the symbols at the start of the trellis are unknown.

5 For k = 1, ..., L, decisions on $\mathbf{s}_k$ are taken on an SBS basis.

6 The effect of PMD on DBP and SBS-SDBP has been reported in [22].

7 Except for 56 GBd NDM links, where we have simulated with K = 8192
symbols to properly account for the memory in the system.

8 However, a dispersion compensating fiber (DCF), simulated according to
G.655 specifications, exhibited similar trends as the FBG.

TABLE I
Number of spans, N, used in DM and NDM links

Modulation   R_s [GBd]   DM   NDM
QPSK         14          50   110
QPSK         28          35   110
QPSK         56          35   110
16-QAM       14          50   110
16-QAM       28          40   110
16-QAM       56          40   110
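The segment-length rule stated in the simulation setup can be evaluated numerically with the short Python sketch below. The fiber parameters are those listed in the text; the wavelength of 1550 nm and the example launch power of 1 mW are assumptions made here for illustration, since the text does not state them, and the function name is hypothetical.

```python
import math

def ssfm_segment_length_km(power_w, symbol_rate_baud, gamma_per_w_km=1.3,
                           dispersion_ps_nm_km=16.0, wavelength_nm=1550.0, eps=1e-4):
    """Segment length Delta = (eps * L_N * L_D^2)^(1/3), returned in km."""
    c = 299_792_458.0                          # speed of light in m/s
    t_s = 1.0 / symbol_rate_baud               # symbol period in s
    d_si = abs(dispersion_ps_nm_km) * 1e-6     # ps/(nm km) converted to s/m^2
    lam = wavelength_nm * 1e-9                 # wavelength in m
    # Dispersion length L_D = T_s^2 * 2*pi*c / (|D| * lambda^2), converted from m to km.
    l_d_km = t_s ** 2 * 2.0 * math.pi * c / (d_si * lam ** 2) / 1e3
    # Nonlinear length L_N = 1 / (gamma * P), in km.
    l_n_km = 1.0 / (gamma_per_w_km * power_w)
    return (eps * l_n_km * l_d_km ** 2) ** (1.0 / 3.0)

# Illustrative evaluation at 28 Gbaud and an assumed launch power of 1 mW.
delta_28 = ssfm_segment_length_km(1e-3, 28e9)
delta_56 = ssfm_segment_length_km(1e-3, 56e9)
```

Since the dispersion length scales with the square of the symbol period, doubling the symbol rate shortens the segment by a factor of 16^(1/3), and a higher launch power shortens it further through the nonlinear length.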

The number of spans, N, used in each of the scenarios is
summarized in Table I. We also considered a non-DM (NDM)
system, wherein there are no DCMs. The span length used for
the SMF, $L_{\mathrm{SMF}}$, is 80 km for 16-QAM and 120 km for QPSK$^{9}$.
An FBG with an insertion loss of 3 dB and perfect dispersion
compensation for the preceding SMF is used. The launch
power into the DCM is 4 dB below that of the transmission
fiber, which is compensated for by the EDFA after the DCM. The
noise figure is 5 dB for each of the amplifiers. Ideal low-pass
filters with one-sided bandwidth of $R_s$ are used in the EDFAs and
at the beginning of the receiver. The filtered signal is sent to
DBP and three different SDBP detectors:

1) SBS-SDBP from [22];

2) DD-SDBP for $L \in \{1, 2\}$, as proposed in Sec. IV-C;
and

3) VA-SDBP for $L \in \{1, 2\}$, as proposed in Sec. IV-B.

In all SDBP detectors, $N_p = 500$ particles were used to
generate the results, but we verified that even with $N_p = 1500$,
similar performance was obtained. The receiver is assumed to
have perfect knowledge of the polarization state, as well as
the carrier phase and the symbol timing.

We consider two performance metrics. To capture the absolute
performance of each detector, we determine the symbol
error rate (SER). To capture the relative performance gain of
SDBP over DBP, we introduce $G_X = \mathrm{SER}_{\mathrm{DBP}}/\mathrm{SER}_{X\text{-SDBP}}$,
where $X \in \{\mathrm{SBS}, \mathrm{DD}, \mathrm{VA}\}$, and in which $\mathrm{SER}_{\mathrm{DBP}}$ and
$\mathrm{SER}_{X\text{-SDBP}}$ are the lowest SERs obtained for the respective
algorithms.
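The gain metric compares the best operating point of each detector, which need not occur at the same launch power. A minimal Python sketch follows; the function name and the SER values are hypothetical and serve only to show the calculation.

```python
def sdbp_gain(ser_dbp, ser_sdbp):
    """G_X: lowest SER of DBP divided by lowest SER of X-SDBP over the simulated powers."""
    return min(ser_dbp) / min(ser_sdbp)

# Hypothetical SER curves over three launch powers (not results from the paper).
ser_dbp_curve = [1e-2, 1e-3, 5e-3]
ser_va_curve = [5e-3, 2e-4, 1e-4]
gain_va = sdbp_gain(ser_dbp_curve, ser_va_curve)   # about 10: ten times lower SER
```

A value above 1 indicates that the SDBP variant reaches a lower minimum SER than DBP.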

B. Results and Discussion 

The SER as a function of input power is shown in Fig. 2
for the 56 Gbaud system with FBG as DCM for (a) QPSK
and (b) 16-QAM. Due to complexity reasons, VA-SDBP is
simulated only with L = 1 for 16-QAM. Comparing the SER
of SBS-SDBP with DD-SDBP/VA-SDBP, we can conclude
that by taking the residual memory into account, the SER is
significantly reduced. One can also see that for both VA-SDBP
and DD-SDBP, increasing L leads to a decrease in the SER.
We expect this gain to saturate as L increases. We also see
that the optimal power (i.e., corresponding to the lowest SER)
varies for each detector: compared to DBP, the optimal power
is up to 2 dB larger for QPSK and up to 4 dB larger for
16-QAM.

9 The number of spans and span lengths are selected such that the symbol
error rate for DBP is less than 0.001.




Fig. 2. SER as a function of input power for 56 Gbaud, FBG as 
DCM, for (a) QPSK and (b) 16-QAM. Solid (resp. dashed) lines in 
VA-SDBP and DD-SDBP represent cases when L = 1 (resp. L = 2). 

Similar behavior is observed for other symbol rates, although
we do not show all results. Instead, a summary of the
performance gains is presented in Fig. 3 for QPSK with L = 2
and for 16-QAM with L = 1. As the complexity of VA-SDBP
grows exponentially with L, the same value of L was used
for both DD-SDBP and VA-SDBP to have a fair comparison.
From Fig. 2, irrespective of the symbol rate and of DM or NDM
links, we observe a clear trend: $\mathrm{SER}_{\mathrm{VA\text{-}SDBP}} < \mathrm{SER}_{\mathrm{DD\text{-}SDBP}} <
\mathrm{SER}_{\mathrm{SBS\text{-}SDBP}} < \mathrm{SER}_{\mathrm{DBP}}$. The VA-SDBP can account for
both causal and non-causal effects, giving it a performance
benefit over DD-SDBP. SBS-SDBP ignores both causal and
non-causal memory and thus exhibits the worst performance
among the three selected approaches. The decreasing gains
with increasing symbol rate for SBS-SDBP can be explained
as follows. The larger the deviation of the particle clouds,
given by $\{\tilde{\mathbf{s}}^{(n)}\}_{n=1}^{N_p}$, from a circularly symmetric Gaussian, the
higher are the expected gains in SDBP compared to DBP. As
the symbol rate increases, the particle clouds in SBS-SDBP
tend to become more circular Gaussian and hence the gains
decrease. Also, for a DM link, we have observed that the
particle clouds are less circular Gaussian and hence gains are
higher for SDBP in DM links compared to NDM links.

Fig. 3. Gains in SER for the proposed algorithms compared to DBP.

The gains in VA-SDBP for QPSK increase with increasing
symbol rate, whereas for 16-QAM the gains increase
from 14 GBd to 28 GBd and then decrease. This
may be due to the use of L = 1 for 16-QAM, which is not
sufficient to account for the residual memory, especially at
high symbol rates. The main drawback of VA-SDBP is its
complexity, which grows exponentially with the memory L.
An interesting case would therefore be to test a low-complexity
version of the VA for 16-QAM with higher memory. DD-SDBP
is a tradeoff between SBS-SDBP and VA-SDBP, in terms
of complexity and performance. Irrespective of which SDBP
approach is used, there is always an improvement in terms of
SER compared to the traditional DBP algorithm (i.e., $G_X > 1$).
This means that NSNI plays an important role in the systems
under consideration and one can gain significantly by taking
these interactions into account. The NSNI is more important
in the DM systems than in the NDM systems, so that gains are
lower in NDM systems. An additional observation that can be
made from Fig. 3 is that the gains are in general higher for 16-QAM
than for QPSK, as 16-QAM experiences stronger nonlinear effects than
QPSK and hence more signal-noise interactions, leading to
higher gains of the SDBP approaches compared to DBP. Gains in
QPSK NDM links (not shown here) turn out to be lower than
the corresponding gains in the 16-QAM NDM case.

VI. Conclusions 

We have extended the SDBP algorithm to account for 
residual memory that may be present due to non-optimal 
processing in SDBP. Specifically, we proposed DD-SDBP and 
VA-SDBP to account for this memory, at an increased cost in 
terms of complexity. Extensive simulations were performed to 
evaluate these methods for 16-QAM and QPSK, and for DM
and NDM links. Results suggest a significant improvement by
the proposed detectors for DM links with up to 10 times lower
SER for QPSK and up to 14 times lower SER for 16-QAM,
compared to DBP.

The VA-SDBP can provide optimal decisions on the transmitted
sequence, but does so at a high computational cost.
Alternatives to consider are low-complexity variations of the
VA, as well as algorithms that provide symbol-wise optimal
decisions, such as the Bahl-Cocke-Jelinek-Raviv (BCJR)
algorithm [42].

Further gains over the proposed algorithms may be possible
and remain the topic of ongoing and future research. Lower
SER can be expected by increasing the memory in VA-SDBP
until the SER gains saturate. In addition, the use of a matched
filter is not necessarily optimal. Initial results in this direction
can be found in [38]. Finally, in SBS-SDBP, DD-SDBP, and
VA-SDBP, the particles after matched filtering and sampling
are approximated with a multivariate Gaussian distribution,
which need not be a good approximation, especially at high
input powers. Other types of distributions should be considered.